智能论文笔记

Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round

Manh Hung Nguyen , Lisheng Sun , Nathan Grinsztajn , Isabelle Guyon

分类：机器学习 | 人工智能

2022-08-04

学习曲线的元学习是机器学习社区中一个重要但经常被忽视的研究领域。我们介绍了一系列基于学习的基于学习的元学习挑战，其中代理商根据来自环境的学习曲线的反馈来寻找适合给定数据集的最佳算法。第一轮吸引了学术界和工业的参与者。本文分析了第一轮的结果（被WCCI 2022的竞争计划接受），以了解使元学习者成功从学习曲线学习的东西。通过从第一轮中学到的教训以及参与者的反馈，我们通过新的协议和新的元数据设计设计了第二轮挑战。我们的第二轮挑战在2022年Automl-Conf中被接受，目前正在进行中。

translated by 谷歌翻译

SAFL: A Self-Attention Scene Text Recognizer with Focal Loss

Bao Hieu Tran , Thanh Le-Cong , Huu Manh Nguyen , Duc Anh Le , Thanh Hung Nguyen , Phi Le Nguyen

分类：计算机视觉 | 机器学习

2022-01-01

在过去的几十年中，由于其在广泛的应用中，现场文本认可从学术界和实际用户获得了全世界的关注。尽管在光学字符识别方面取得了成就，但由于诸如扭曲或不规则布局等固有问题，现场文本识别仍然具有挑战性。大多数现有方法主要利用基于复发或卷积的神经网络。然而，虽然经常性的神经网络（RNN）通常由于顺序计算而遭受慢的训练速度，并且遇到消失的梯度或瓶颈，但CNN在复杂性和性能之间衡量折衷。在本文中，我们介绍了SAFL，一种基于自我关注的神经网络模型，具有场景文本识别的焦点损失，克服现有方法的限制。使用焦损而不是负值对数似然有助于模型更多地关注低频样本训练。此外，为应对扭曲和不规则文本，我们在传递到识别网络之前，我们利用空间变换（STN）来纠正文本。我们执行实验以比较拟议模型的性能与七个基准。数值结果表明，我们的模型实现了最佳性能。

translated by 谷歌翻译

Neural Collapse in Deep Linear Network: From Balanced to Imbalanced Data

Hien Dang , Tan Nguyen , Tho Tran , Hung Tran , Nhat Ho

分类：机器学习 | (统计)机器学习

2023-01-01

Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive amounts of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have theoretically shown the global solutions to the training network problem under a simplified "unconstrained feature model" exhibiting this phenomenon. We take a step further and prove the Neural Collapse occurrence for deep linear network for the popular mean squared error (MSE) and cross entropy (CE) loss. Furthermore, we extend our research to imbalanced data for MSE loss and present the first geometric analysis for Neural Collapse under this setting.

translated by 谷歌翻译

Multisensor Data Fusion for Reliable Obstacle Avoidance

Thanh Nguyen Canh , Truong Son Nguyen , Cong Hoang Quach , Xiem HoangVan , Manh Duong Phung

分类：机器人

2022-12-26

In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.

translated by 谷歌翻译

Deployment of UAVs for Optimal Multihop Ad-hoc Networks Using Particle Swarm Optimization and Behavior-based Control

Ngan Duong Thi Thuy , Duy Nam Bui , Manh Duong Phung , Hung Pham Duy

分类：机器人

2022-12-26

This study proposes an approach for establishing an optimal multihop ad-hoc network using multiple unmanned aerial vehicles (UAVs) to provide emergency communication in disaster areas. The approach includes two stages, one uses particle swarm optimization (PSO) to find optimal positions to deploy UAVs, and the other uses a behavior-based controller to navigate the UAVs to their assigned positions without colliding with obstacles in an unknown environment. Several constraints related to the UAVs' sensing and communication ranges have been imposed to ensure the applicability of the proposed approach in real-world scenarios. A number of simulation experiments with data loaded from real environments have been conducted. The results show that our proposed approach is not only successful in establishing multihop ad-hoc routes but also meets the requirements for real-time deployment of UAVs.

translated by 谷歌翻译

Learning to Generate Questions by Enhancing Text Generation with Sentence Selection

Do Hoang Thai Duong , Nguyen Hong Son , Hung Le , Minh-Tien Nguyen

分类：自然语言处理

2022-12-23

We introduce an approach for the answer-aware question generation problem. Instead of only relying on the capability of strong pre-trained language models, we observe that the information of answers and questions can be found in some relevant sentences in the context. Based on that, we design a model which includes two modules: a selector and a generator. The selector forces the model to more focus on relevant sentences regarding an answer to provide implicit local information. The generator generates questions by implicitly combining local information from the selector and global information from the whole context encoded by the encoder. The model is trained jointly to take advantage of latent interactions between the two modules. Experimental results on two benchmark datasets show that our model is better than strong pre-trained models for the question generation task. The code is also available (shorturl.at/lV567).

translated by 谷歌翻译

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

分类：计算机视觉 | 机器学习

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

translated by 谷歌翻译

Improving Warped Planar Object Detection Network For Automatic License Plate Recognition

Nguyen Dinh Tra , Nguyen Cong Tri , Phan Duy Hung

分类：计算机视觉 | 人工智能

2022-12-14

This paper aims to improve the Warping Planer Object Detection Network (WPOD-Net) using feature engineering to increase accuracy. What problems are solved using the Warping Object Detection Network using feature engineering? More specifically, we think that it makes sense to add knowledge about edges in the image to enhance the information for determining the license plate contour of the original WPOD-Net model. The Sobel filter has been selected experimentally and acts as a Convolutional Neural Network layer, the edge information is combined with the old information of the original network to create the final embedding vector. The proposed model was compared with the original model on a set of data that we collected for evaluation. The results are evaluated through the Quadrilateral Intersection over Union value and demonstrate that the model has a significant improvement in performance.

translated by 谷歌翻译

Efficient Graph Neural Network Inference at Large Scale

Xinyi Gao , Wentao Zhang , Yingxia Shao , Quoc Viet Hung Nguyen , Bin Cui , Hongzhi Yin

分类：机器学习

2022-11-01

Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications. However, the enormous size of large-scale graphs hinders their applications under real-time inference scenarios. Although existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure, these methods still suffer from scalability issues when making inferences on unseen nodes, as the feature preprocessing requires the graph is known and fixed. To speed up the inference in the inductive setting, we propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information. This could successfully avoid the redundant computation of feature propagation. Moreover, the trade-off between accuracy and inference latency can be flexibly controlled by simple hyper-parameters to match different latency constraints of application scenarios. To compensate for the potential inference accuracy loss, we further propose Inception Distillation to exploit the multi scale reception information and improve the inference performance. Extensive experiments are conducted on four public datasets with different scales and characteristics, and the experimental results show that our proposed inference acceleration framework outperforms the SOTA graph inference acceleration baselines in terms of both accuracy and efficiency. In particular, the advantage of our proposed method is more significant on larger-scale datasets, and our framework achieves $75\times$ inference speedup on the largest Ogbn-products dataset.

translated by 谷歌翻译

Improving Document Image Understanding with Reinforcement Finetuning

Bao-Sinh Nguyen , Dung Tien Le , Hieu M. Vu , Tuan Anh D. Nguyen , Minh-Tien Nguyen , Hung Le

分类：计算机视觉 | 机器学习

2022-09-26

成功的人工智能系统通常需要大量标记的数据来从文档图像中提取信息。在本文中，我们研究了改善人工智能系统在理解文档图像中的性能的问题，尤其是在培训数据受到限制的情况下。我们通过使用加强学习提出一种新颖的填充方法来解决问题。我们的方法将信息提取模型视为策略网络，并使用策略梯度培训来更新模型，以最大程度地提高补充传统跨凝结损失的综合奖励功能。我们使用标签和专家反馈在四个数据集上进行的实验表明，我们的填充机制始终提高最先进的信息提取器的性能，尤其是在小型培训数据制度中。

translated by 谷歌翻译